<html>
<head><meta charset="utf-8"><title>Lazy DefPathTable decoding · t-compiler/wg-incr-comp · Zulip Chat Archive</title></head>
<h2>Stream: <a href="https://rust-lang.github.io/zulip_archive/stream/241847-t-compiler/wg-incr-comp/index.html">t-compiler/wg-incr-comp</a></h2>
<h3>Topic: <a href="https://rust-lang.github.io/zulip_archive/stream/241847-t-compiler/wg-incr-comp/topic/Lazy.20DefPathTable.20decoding.html">Lazy DefPathTable decoding</a></h3>

<hr>

<base href="https://rust-lang.zulipchat.com">

<head><link href="https://rust-lang.github.io/zulip_archive/style.css" rel="stylesheet"></head>

<a name="223896964"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/241847-t-compiler/wg-incr-comp/topic/Lazy%20DefPathTable%20decoding/near/223896964" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> mw <a href="https://rust-lang.github.io/zulip_archive/stream/241847-t-compiler/wg-incr-comp/topic/Lazy.20DefPathTable.20decoding.html#223896964">(Jan 25 2021 at 13:33)</a>:</h4>
<p>Hey <span class="user-mention" data-user-id="125294">@Aaron Hill</span>and  <span class="user-mention" data-user-id="306073">@Tyson Nottingham</span>, since you've both been doing work in this area I wanted to give you a heads up that I'm experimenting with an alternative implementation of lazy DefPathTable decoding along the lines this description: <a href="https://github.com/rust-lang/rust/pull/80177#issuecomment-749484088">https://github.com/rust-lang/rust/pull/80177#issuecomment-749484088</a></p>
<p>I have a promising implementation of a hashtable that can be used directly in its binary format (i.e. without any upfront decoding) and making DefPaths "structured" turned out to be pretty simple. Overall this seems to simplify things quite a bit. The on-disk hashtable adds some complexity -- but it is a generic data structure that might come in handy for other use cases too.</p>
<p>Let me know if you'd like to get a ping when things have progressed far enough for a performance testing PR.</p>



<a name="223944244"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/241847-t-compiler/wg-incr-comp/topic/Lazy%20DefPathTable%20decoding/near/223944244" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Tyson Nottingham <a href="https://rust-lang.github.io/zulip_archive/stream/241847-t-compiler/wg-incr-comp/topic/Lazy.20DefPathTable.20decoding.html#223944244">(Jan 25 2021 at 19:03)</a>:</h4>
<p>Sure! Sounds neat. :)</p>



<a name="224095977"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/241847-t-compiler/wg-incr-comp/topic/Lazy%20DefPathTable%20decoding/near/224095977" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> mw <a href="https://rust-lang.github.io/zulip_archive/stream/241847-t-compiler/wg-incr-comp/topic/Lazy.20DefPathTable.20decoding.html#224095977">(Jan 26 2021 at 20:12)</a>:</h4>
<p>I just created a public repo for the hash table implementation at <a href="https://github.com/michaelwoerister/odht">https://github.com/michaelwoerister/odht</a>.</p>



<a name="224096165"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/241847-t-compiler/wg-incr-comp/topic/Lazy%20DefPathTable%20decoding/near/224096165" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> mw <a href="https://rust-lang.github.io/zulip_archive/stream/241847-t-compiler/wg-incr-comp/topic/Lazy.20DefPathTable.20decoding.html#224096165">(Jan 26 2021 at 20:13)</a>:</h4>
<p>It uses robin hood hashing (i.e. what libstd used before HashBrown) and seems to perform pretty well (although HashBrown can be up to 2x faster)</p>



<a name="224096263"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/241847-t-compiler/wg-incr-comp/topic/Lazy%20DefPathTable%20decoding/near/224096263" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> mw <a href="https://rust-lang.github.io/zulip_archive/stream/241847-t-compiler/wg-incr-comp/topic/Lazy.20DefPathTable.20decoding.html#224096263">(Jan 26 2021 at 20:14)</a>:</h4>
<p>It should not be too hard to change the underlying implementation in the future, if needed</p>



<a name="224097633"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/241847-t-compiler/wg-incr-comp/topic/Lazy%20DefPathTable%20decoding/near/224097633" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Tyson Nottingham <a href="https://rust-lang.github.io/zulip_archive/stream/241847-t-compiler/wg-incr-comp/topic/Lazy.20DefPathTable.20decoding.html#224097633">(Jan 26 2021 at 20:25)</a>:</h4>
<p>Nice. I'll try to check it out sometime this week and let you know if I run into any problems.</p>



<a name="224159639"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/241847-t-compiler/wg-incr-comp/topic/Lazy%20DefPathTable%20decoding/near/224159639" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> mw <a href="https://rust-lang.github.io/zulip_archive/stream/241847-t-compiler/wg-incr-comp/topic/Lazy.20DefPathTable.20decoding.html#224159639">(Jan 27 2021 at 09:56)</a>:</h4>
<p>Great. I'm interested in any kind of feedback, especially concerning the public API.</p>



<a name="224763034"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/241847-t-compiler/wg-incr-comp/topic/Lazy%20DefPathTable%20decoding/near/224763034" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> mw <a href="https://rust-lang.github.io/zulip_archive/stream/241847-t-compiler/wg-incr-comp/topic/Lazy.20DefPathTable.20decoding.html#224763034">(Feb 01 2021 at 17:39)</a>:</h4>
<p>I just opened <a href="https://github.com/rust-lang/rust/pull/81635">https://github.com/rust-lang/rust/pull/81635</a>, which implements the "structured DefPathHash" pre-requisite.</p>



<a name="224955186"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/241847-t-compiler/wg-incr-comp/topic/Lazy%20DefPathTable%20decoding/near/224955186" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Tyson Nottingham <a href="https://rust-lang.github.io/zulip_archive/stream/241847-t-compiler/wg-incr-comp/topic/Lazy.20DefPathTable.20decoding.html#224955186">(Feb 02 2021 at 23:59)</a>:</h4>
<p><span class="user-mention" data-user-id="124287">@mw</span>, I haven't read the whole implementation, but I can give a little feedback on the ODHT API. By the way, is your intent for it to be general purpose and possibly release it on <a href="http://crates.io">crates.io</a>, or for it to be purpose-built for DefPathTable decoding? I might be less inclined to nitpick if it's the latter.</p>
<p>Now to feedback:</p>
<ul>
<li>
<p>Is it possible yet to use const generics to enforce constraint that RawKey and RawValue are byte arrays? If not, is it possible to easily enforce it some other way?</p>
</li>
<li>
<p>I don't think it's possible for clients of odht to implement ByteArray directly for byte arrays as suggested since both the trait and byte arrays themselves are foreign. You can newtype the byte arrays and impl ByteArray on that I suppose. Requiring ByteArray: Default adds another wrinkle for byte arrays longer than 32 elements. In any case, not a big deal if we just care about the DefPathTable issue. By the way, ByteArray also isn't implementable by clients because it's currently in a private module.</p>
</li>
<li>
<p>Raw is a frequently overloaded term. Is it worth using something more descriptive, albeit longer? Maybe Encoded?</p>
</li>
<li>
<p>The documentation with respect to endianness or platform-independence could mention that the ODHT implementation supports being able to encode on, for example, an LE platform and correctly decode on a BE platform, <em>but</em> (I believe) that it depends on the provided hashing and encoding functions working in a platform-independent way to do that. And similarly for moving between 32-bit vs 64-bit platforms.</p>
</li>
</ul>
<p>Overall, seems good!</p>



<a name="225000617"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/241847-t-compiler/wg-incr-comp/topic/Lazy%20DefPathTable%20decoding/near/225000617" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> mw <a href="https://rust-lang.github.io/zulip_archive/stream/241847-t-compiler/wg-incr-comp/topic/Lazy.20DefPathTable.20decoding.html#225000617">(Feb 03 2021 at 11:03)</a>:</h4>
<p>Awesome, thank you for the feedback, <span class="user-mention" data-user-id="306073">@Tyson Nottingham</span>!</p>
<ul>
<li>Yes, I think it would make sense to release the crate on <a href="http://crates.io">crates.io</a> (similar to the <code>rustc_hash</code> crate). It's much easier to work on a standalone crate than on something that is part of the compiler (because of build times). It also encourages loose coupling.</li>
<li>It don't think it's possible to use const generics yet. At least the compiler gave me error messages when I tried to. Luckily I remembered that the <code>smallvec</code> crate had found a workaround (i.e. use fixed size arrays), so I used that. I haven't tried hard to find a different way of enforcing that <code>RawKey</code> and <code>RawValue</code> are fixed size arrays. </li>
<li>As I see it, clients are not really supposed to implement <code>ByteArray</code>. Rather it's the library's job to provide implementations up to a reasonable size (as it is the case for the <code>smallvec</code> crate, I think). I wonder if that is too restrictive. We could switch from the <code>ByteArray</code> trait to an <code>Encoded</code> trait with a single <code>fn as_le_bytes_with_no_padding(&amp;self) -&gt; &amp;[u8]</code> method (and enforce that the type has an alignment of 1). That would allow for types like <code>PackedFingerprint</code> to be used directly on little-endian platforms. But I am not sure if that is worth the trouble. LLVM is able to basically optimize away the encoding/decoding step if the byte representation is the same. And I like that fixed size byte arrays make it hard to get the alignment wrong and make it rather explicit what's expected.</li>
<li>Yes, <code>Encoded</code> is better than <code>Raw</code>, especially for the type names in the <code>Config</code> trait. For <code>RawTable</code> it makes less sense, maybe. <code>RawTable</code> is called that because it works only with raw bytes and does not know about what types are encoded in them.</li>
</ul>



<a name="225001961"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/241847-t-compiler/wg-incr-comp/topic/Lazy%20DefPathTable%20decoding/near/225001961" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> simulacrum <a href="https://rust-lang.github.io/zulip_archive/stream/241847-t-compiler/wg-incr-comp/topic/Lazy.20DefPathTable.20decoding.html#225001961">(Feb 03 2021 at 11:18)</a>:</h4>
<p>min_const_generics are stable on nightly right now so should be possible to use, but maybe not for this use case</p>



<hr><p>Last updated: Aug 07 2021 at 22:04 UTC</p>
</html>